Abstract: Data transformation and aggregation make up a significant portion of the data set preparation effort in data mining and data analysis. In a relational database environment, building such a data set requires joining tables and aggregating columns from different dynamic tables. Several aggregation functions based on SQL operations have been introduced for multi-table aggregation using vertical joins. These existing SQL aggregations are limited because they return a single number per static data group. They work well for static data sets, but significant effort is still required to build data sets suitable for data mining, where a tabular format is generally required and frequent updates are needed. This work proposes a simple and effective summarization-based dynamic join operation over high-dimensional data sets. It extends the SQL aggregate functions to produce aggregations in horizontal form, returning a set of numbers instead of a single aggregated value. The work also proposes Multi Class Clustering (MCC) and a Weighted PCA method to handle high-dimensional dynamic data sets with a summarization technique. The proposed technique addresses two common data preparation tasks: transposition/aggregation and transforming categorical attributes into summarized labels. Horizontal aggregations are evaluated using three basic methods: CASE, SPJ, and PIVOT.

Keywords: aggregation, Weighted PCA method, Multi Class Clustering (MCC)
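
To illustrate what a horizontal aggregation looks like in practice, the following is a minimal sketch of the CASE method using Python's built-in sqlite3 module. The table name (sales), its columns (store, quarter, amount), the sample rows, and the helper function are hypothetical and chosen only for illustration; they are not taken from this work. The sketch also assumes the set of category values (Q1, Q2) is known in advance and fills missing combinations with 0.

```python
import sqlite3


def horizontal_sales_by_quarter(rows):
    """Return one row per store with one SUM column per quarter (CASE method).

    Hypothetical schema: sales(store, quarter, amount).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

    # A vertical aggregation would group by (store, quarter) and return
    # one row per combination. Here each quarter becomes its own column,
    # so the result has a single row per store.
    query = """
        SELECT store,
               SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0.0 END) AS q1_total,
               SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0.0 END) AS q2_total
        FROM sales
        GROUP BY store
        ORDER BY store
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (store, quarter, amount)
SAMPLE_ROWS = [
    ("A", "Q1", 10.0),
    ("A", "Q2", 5.0),
    ("A", "Q1", 2.5),
    ("B", "Q2", 7.0),
]

if __name__ == "__main__":
    for row in horizontal_sales_by_quarter(SAMPLE_ROWS):
        print(row)
```

Each output row holds a set of numbers for one group (one total per quarter) rather than a single aggregated value, which is the tabular layout that data mining tools generally expect.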